Finding rare objects and building pure samples: Probabilistic quasar classification from low resolution Gaia spectra

Authors

  • R. Sordo
  • A. Vallenari
Abstract

We develop and demonstrate a probabilistic method for classifying rare objects in surveys with the particular goal of building very pure samples. It works by modifying the output probabilities from a classifier so as to accommodate our expectation (priors) concerning the relative frequencies of different classes of objects. We demonstrate our method using the Discrete Source Classifier, a supervised classifier currently based on Support Vector Machines, which we are developing in preparation for the Gaia data analysis. DSC classifies objects using their very low resolution optical spectra. We look in detail at the problem of quasar classification, because identification of a pure quasar sample is necessary to define the Gaia astrometric reference frame. By varying a posterior probability threshold in DSC we can trade off sample completeness and contamination. We show, using our simulated data, that it is possible to achieve a pure sample of quasars (upper limit on contamination of 1 in 40 000) with a completeness of 65% at magnitudes of G=18.5, and 50% at G=20.0, even when quasars have a frequency of only 1 in every 2000 objects. The star sample completeness is simultaneously 99% with a contamination of 0.7%. Including parallax and proper motion in the classifier barely changes the results. We further show that not accounting for class priors in the target population leads to serious misclassifications and poor predictions for sample completeness and contamination. We discuss how a classification model prior may, or may not, be influenced by the class distribution in the training data. Our method controls this prior and so allows a single model to be applied to any target population without having to tune the training data and retrain the model.
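
The prior adjustment and the threshold trade-off described in the abstract can be illustrated with a short sketch. It assumes the standard Bayesian reweighting (divide each class posterior by the prior implicit in the model, multiply by the prior expected in the target population, then renormalize), which may differ in detail from the authors' exact formulation. The function names, the two-class setup, the equal model prior, and the example probabilities are all hypothetical; only the 1-in-2000 quasar frequency comes from the abstract.

```python
import numpy as np


def reweight_posteriors(posteriors, model_prior, target_prior):
    """Replace the prior implicit in classifier outputs with a target prior.

    posteriors:   array of shape (n_objects, n_classes), rows sum to 1
    model_prior:  class prior assumed to be implicit in the model (assumption)
    target_prior: class frequencies expected in the target population
    """
    posteriors = np.asarray(posteriors, dtype=float)
    ratio = np.asarray(target_prior, dtype=float) / np.asarray(model_prior, dtype=float)
    weighted = posteriors * ratio  # broadcast the per-class ratio over all objects
    return weighted / weighted.sum(axis=-1, keepdims=True)  # renormalize each row


def sample_stats(selected, is_target):
    """Return (completeness, contamination) of a selected sample."""
    selected = np.asarray(selected, dtype=bool)
    is_target = np.asarray(is_target, dtype=bool)
    n_selected = selected.sum()
    n_target = is_target.sum()
    true_positives = (selected & is_target).sum()
    completeness = true_positives / n_target if n_target else float("nan")
    contamination = (n_selected - true_positives) / n_selected if n_selected else 0.0
    return completeness, contamination


# Hypothetical nominal outputs for two objects; columns are (star, quasar).
nominal = np.array([
    [0.1, 0.9],        # looks like a quasar under equal priors
    [0.0001, 0.9999],  # very strong quasar evidence
])
model_prior = [0.5, 0.5]              # assumed: model behaves as if classes were balanced
target_prior = [1999 / 2000, 1 / 2000]  # quasars are 1 in every 2000 objects

adjusted = reweight_posteriors(nominal, model_prior, target_prior)

# Select quasars with a posterior threshold; raising it trades completeness for purity.
threshold = 0.5
selected = adjusted[:, 1] > threshold
is_quasar = np.array([False, True])  # hypothetical true labels
completeness, contamination = sample_stats(selected, is_quasar)
```

With the rare-quasar prior applied, the first object's quasar probability falls well below the threshold, while the second object still passes. This is the sense in which a single trained model can serve different target populations: only `target_prior` changes, and no retraining is needed.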

Similar resources

The Molonglo Reference Catalog 1-Jy Radio Source Survey IV: Optical Spectroscopy of a Complete Quasar Sample

Optical spectroscopic data are presented here for quasars from the Molonglo Quasar Sample (MQS), which forms part of a complete survey of 1-Jy radio sources from the Molonglo Reference Catalogue. The combination of low-frequency selection and complete identifications means that the MQS is relatively free from the orientation biases which affect most other quasar samples. To date, the sample inc...

New cataclysmic variables from the RASS

We report on a follow-up study of 15 CV candidates, which were discovered by the ROSAT All-Sky Survey and have been identified on the objective prism plates of the Hamburg Quasar Survey so far. For all objects we could obtain low resolution optical spectra confirming 12 CVs. The misidentifications are two quasars and an M-dwarf.

Properties of galaxies in SDSS Quasar environments at z < 0.2

We analyse the environment of low redshift, z < 0.2, SDSS quasars using the spectral and photometric information of galaxies from the Sloan Digital Sky Survey Third Data Release (SDSS-DR3). We compare quasar neighbourhoods with field and high density environments through an analysis on samples of typical galaxies and groups. We compute the surrounding surface number density of galaxies finding ...

Stellar Rotation from GAIA Spectra

Stellar rotation influences our understanding of stellar structure and evolution, binary systems, clusters, etc., and therefore the benefits of a large and highly accurate database on stellar rotation, obtained by GAIA, will be manifold. To study the prospects of GAIA measurement of projected rotational velocities vrot sin i, we use synthetic stellar spectra to simulate the determinatio...

Probing Reionization with Quasar Spectra: the Impact of the Intrinsic Lyman-α Emission Line Shape Uncertainty

Arguably the best hope of understanding the tail end of the reionization of the intergalactic medium (IGM) at redshift z > 6 is through the detection and characterization of the Gunn-Peterson (GP) damping wing absorption of the IGM in bright quasar spectra. However, the use of quasar spectra to measure the IGM damping wing requires a model of the quasar’s intrinsic Lyman-α emission line. Here w...

Publication date: 2008